Learning Shallow Syntactic Dependencies from Imbalanced Datasets: A Case Study in Modern Greek and English

Authors

  • Argiro Karozou
  • Katia Kermanidis
Abstract

The present work aims to create a shallow parser for Modern Greek subject/object detection using machine learning techniques. The parser relies on limited resources. Experiments with equivalent input and the same learning techniques were also conducted for English, showing that the methodology can be adapted to other languages with only minor modifications. For the first time, the class imbalance problem in Modern Greek syntactically annotated data is successfully addressed.

Similar resources

Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...

Learning Subcategorization Frames from Corpora: a Case Study for Modern Greek

Certain Natural Language Processing (NLP) applications such as parsing and semantic processing require complete lexicons that provide subcategorization information for a word of interest, i.e. the necessary information about the set(s) of syntactic constituents the word must combine with, in order for its meaning to be fully expressed. Modern Greek presents high flexibility in the allowable ord...

Online Processing of English Wh-Dependencies by Iranian EFL Learners

To be able to reach the level of ultimate attainment in the second language, learners need to acquire not only the grammar of the L2 but also the language processing mechanisms involved in the comprehension of sentences in real time. Contrary to its importance, very little is known yet about online L2 processing. This study examines whether advanced Iranian learners of English reactivate disloc...

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

Learning of relative clauses by L3 learners of English

In surveys of third language acquisition (TLA) research, mixed results demonstrate that there is no consensus among researchers regarding the advantages and/or disadvantages of bilinguality on TLA. The main concern of the present study was, thus, to probe the probable differences between Persian monolingual and Azeri-Persian bilingual learners of English regarding their...


Publication date: 2011